Marking of electronic documents in order to expose unauthorized publication

ABSTRACT

A method of marking information to be produced in order to expose unauthorized copying. In the method, a user identifier is defined, a source file containing a substantial part of the information produced is created, and a different target file is created for each user on the basis of the source file and the user identifier. In the invention, several predefined modification rules are stored in a file in order to modify the information contained in the source file. Using a digital processor, a sequence of modification keys is generated on the basis of the user identifier, the locations of the source file to be modified are defined, and modifications are made in these locations, determining the nature and/or position of the modifications on the basis of the predefined modification rules and modification keys. From the content of the target file delivered to the user, it is possible to conclude a user-specific identifier and thereby the person responsible for the unauthorized copy.

BACKGROUND OF INVENTION

The invention relates to a technique by which unauthorized publication and copying of electronic information can be exposed. The electronic information means particularly documents and programs distributed in an Internet type network.

The copying of electronically distributed information is technically very simple. For example, illegal copying of computer programs causes annual losses of billions of dollars to the software industry. Pirate copies are also a major problem in the video and recording industries.

The common feature of all such information is that any protection of the information is based on copyright (or closely related forms of protection, such as data base protection), an agreement, or non-disclosure obligation. As previously known, the essential difference between copyright and industrial protection, such as patent and design protection, is that in industrial protection infringement is sufficiently shown if products can be shown to be similar, whereas to show infringement of copyright it must be shown that a product has been copied rather than created independently. The same applies not only to copyright, but also to infringement of commercial secrets and other non-disclosure obligations. Such infringement is particularly difficult to prove in respect of databases where information is, in principle, available to anyone, and any additional value produced by the information provider is based on advantageous selection or presentation of information. In the present application, the term ‘copyright’ should thus be understood in the wide sense to comprise commercial secrets, non-disclosure obligations, etc. as well as the actual author's rights. Correspondingly, a ‘copyright proprietor’ here refers to anyone who wants to prevent and/or expose unauthorized copying and/or publication of information.

Even if the copyright proprietor is able to show that the product has been copied without permission, he must also be able to indicate who is responsible for the copying. If an unauthorized copy is found in the possession of an end user who has bought the product in good faith, no compensation can usually be required from the end user. Further, the end user cannot be required to be able to or willing to expose who has sold the product.

For example, software suppliers use technology in which the software asks the user for his name and contact information in connection with the first installation. The information can be encoded in protected form and stored on an installation diskette. On the basis of this information the installation program produces a client identifier, which the client needs as he phones the supplier's telephone support. By monitoring incoming calls and their client identifiers the supplier may be able to expose unauthorized copying. The technology, however, has many defects and restrictions. For example, the technology is based on the assumption that the user will at some point need telephone support. This, however, is not always a correct assumption. Most information can be used even without obtaining subsequent support from the supplier. Since upon installing the software, the dishonest user can supply any contact information whatsoever, it is not always possible to conclude from the unauthorized copy where the copy has been made. By comparing installations conducted with different user information, the unauthorized user can conclude where on the disk the user-specific information is located, and change the information. The technique cannot be applied to protecting text and image files, since the added information is easy to delete.

It is also known to slightly modify e.g. the character spacing, font size, etc. in text documents. The idea is that the user will not notice if there are, for example, a few 11-point characters amidst 12-point text. This solution is previously known e.g. from U.S. Pat. No. 5,467,447. The technology, however, cannot be used for protecting electronically distributed text, since the unauthorized user can simply impose the same layout on all text. Even if a so protected text is delivered on paper, the dishonest user can supply the text to a text scanner, which removes all extra formatting.

BRIEF DESCRIPTION OF INVENTION

The object of the invention is thus to develop a technique by which unauthorized copying of electronically distributed information and the person responsible for the copying can be exposed. The objects of the invention are achieved with a technique that is characterized by what is stated in the independent claims. The preferred embodiments of the invention are claimed in the dependent claims.

The invention is based on the idea that the electronically distributed information is marked, i.e. modified to be different for each user. The modifications are not based on formatting, which can be easily changed, but are buried deep in the bit stream that carries the information. Differences are introduced in so many places in the versions delivered to different users that the unauthorized user cannot detect all the differences, or that it is not economically sensible to attempt to detect all the differences.

A known technique in which the information to be produced is marked in order to expose unauthorized publication comprises the steps of

defining a user identifier,

creating a source file containing a substantial part of the information produced, and

creating a different target file for each user on the basis of the source file and the user identifier.

The method of the invention further comprises the steps of

storing, in a file, several predefined modification rules for modifying the information contained in the source file, and

by means of a digital processor:

generating a sequence of modification keys on the basis of the user identifier,

defining the locations in the source file that are to be modified and making modifications in these locations, determining the nature and/or position of the modifications on the basis of the predefined modification rules and modification keys.

An advantage of the technique of the invention is that a user-specific identifier and thereby the person responsible for the unauthorized copy can be concluded from the information delivered to the user. The advantage obtained by keeping the modifications small is that the information content remains intact and that the unauthorized user will not detect the modifications. When modifications are made in a large number of various locations, the copying and the copier can be exposed even if the information is copied only in part.

BRIEF DESCRIPTION OF FIGURES

In the following the invention will be described in greater detail by means of preferred embodiments illustrated in the attached drawings, in which

FIG. 1 is a block diagram illustrating a technique of the invention,

FIG. 2 illustrates generating a modification key on the basis of a user identifier,

FIGS. 3 to 5 illustrate modifying a source file on the basis of modification keys, and

FIG. 6 illustrates delivery of documents through the Internet.

DETAILED DESCRIPTION OF INVENTION

With reference to FIG. 1, we shall now describe different ways of implementing the technique of the invention in greater detail. Let us first define some of the terms used in the present application.

A ‘document’ is a general term referring to material at least some version of which is to be delivered to the PC user. The document can comprise text or numerical information, audio or video information, a computer program, or a combination of any of these.

A ‘source’ in phrases like ‘source material’ and ‘source document’ refers to material that is not protected and that would be delivered to the user in prior art solutions.

A ‘target’ in e.g. ‘target document’ refers to material that is protected by modifying it specifically for each user.

A ‘file’ means that the material is in a computer-readable form.

In FIG. 1 the source material provider (copyright proprietor) forms a source file 10 from the product. The function of a user identification block 12 is to define a user identifier. A sequence generator 14 generates sequences of modification keys on the basis of the user identifier, generating a different sequence for each user. A modification block 16, which receives an input of a source file 10 and a user-specific sequence, forms from them a user-specific target file 17. The file is delivered through a distribution channel 18. In the following the blocks and functions will be described in greater detail.

The user identification block 12 generates a user identifier 12 a. In the present invention, any technique whatsoever known to the person skilled in the art can be used for identifying the user and generating the identifier 12 a. The reliability of the identification technique should correspond to the importance of the information to be protected. In the field of electronic payment transfers, for example, it is common to use techniques that are based on client identifiers and passwords that are valid only once. The structure and operation of the identification block 12 depends on the transmission path on which the communication takes place, i.e. the user identifier 12 a is received and the target document 17 is delivered. If the document is delivered by mail, e.g. on a diskette, the user identifier 12 a can be based, for example, on the user's name or the number of his credit card. In the Internet, in particular, much information is distributed free of charge, whereby the information provider profits e.g. by gaining publicity. In a case where no actual user identification takes place, information about the user's identity can be obtained on the basis of the electronic address, such as the TCP/IP address or e-mail address. Where particularly valuable information is concerned, the user identifier can be checked e.g. by a call-back procedure.

The sequence generator 14 receives, as an input, a user identifier 12 a, and generates a sequence 15 that is different for each user. The sequence 15 can be, for example, a series of pseudo random numbers. A simple way of forming pseudo random numbers is to raise a 2N-digit seed number to the power of two (i.e. to square it), to take N middle digits of the number obtained, and to raise them to the power of two, and so on. The process is illustrated in FIG. 2, where the arrow pointing downward indicates where the raising to the power of two is conducted. For example, when the user identifier 12 a, which has the value 12345678, is raised to the power of two, giving 152415765279684, and the four middle digits of the number (written with 16 digits, i.e. with one leading zero) are taken, the first pseudo random number 15 a, having the value 5765, is obtained. When this number is raised to the power of two, giving 33235225, and the four middle digits are taken, the second pseudo random number 15 b, having the value 2352, is obtained, and so on. If such a sequence yields a number in which all four middle digits are zeros, the sequence degenerates to zero. This can be avoided e.g. by interpreting the N-digit number as a bit string, which is then divided into parts, with ‘1’ bits placed between the parts.
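By way of illustration only, the squaring procedure of FIG. 2 could be sketched in Python roughly as follows. The function name `middle_square` and its parameters are hypothetical and do not appear in the application; the sketch assumes that the square is padded with leading zeros to twice the digit count of the number squared, and it does not implement the remedy against degeneration to zero.

```python
def middle_square(seed, n_digits=4, count=5):
    """Return `count` pseudo random numbers formed by repeated squaring.

    Assumption: each square is left-padded with zeros to twice the digit
    count of the number that was squared before the middle digits are cut.
    """
    keys = []
    value = seed
    width = len(str(seed))  # digit count of the number about to be squared
    for _ in range(count):
        squared = str(value * value).zfill(2 * width)
        start = (len(squared) - n_digits) // 2
        value = int(squared[start:start + n_digits])  # the N middle digits
        keys.append(value)
        width = n_digits  # later numbers have N digits
    return keys


print(middle_square(12345678, count=2))  # [5765, 2352]
```

With the user identifier 12345678 the first two values agree with the numbers 15 a and 15 b given above.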

As an alternative to the embodiment of FIG. 2, the user identifier 12 a can be interpreted directly as a bit string from which a certain number of bits are used at each modification. When all the bits have been used, one starts to use the bit string all over again. If, for example, the user identifier has 40 bits and three bits are used for forming each modification key, then the 14th modification key can consist of the last bit, after which the first two bits can be used again, and so on.
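The cyclic use of the identifier bits might be sketched as follows. This is a minimal sketch under stated assumptions: the function name `cyclic_bit_keys`, the most-significant-bit-first ordering, and the interpretation of each three-bit group as an integer are illustrative choices, not details given in the application.

```python
def cyclic_bit_keys(user_id, id_bits=40, bits_per_key=3, count=14):
    """Form modification keys by reading the identifier bits cyclically."""
    if user_id < 0 or user_id >= 1 << id_bits:
        raise ValueError("user_id does not fit in id_bits bits")
    bits = format(user_id, "0{}b".format(id_bits))  # MSB first (assumption)
    keys = []
    pos = 0
    for _ in range(count):
        # Take bits_per_key bits, wrapping to the start when the end is reached.
        chunk = "".join(bits[(pos + i) % id_bits] for i in range(bits_per_key))
        keys.append(int(chunk, 2))
        pos = (pos + bits_per_key) % id_bits
    return keys


# Identifier whose first and last bits are 1 and all others 0.
keys = cyclic_bit_keys((1 << 39) | 1)
print(keys[0], keys[13])  # 4 6
```

In the example the 14th key is built from the last bit followed by the first two bits of the identifier, as in the 40-bit case described above.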

The source file 10 created contains or is associated with irrelevant information, or redundancy, which can be modified individually for each user. Image and voice information always contains such redundancy. For example, in the loudest moments of a piece of music, the human ear cannot distinguish whether the least significant bit is modified or not. In a movie, the position of two frames can be changed without the audience noticing anything, and so on. Further, the elements of drawings can be in a different order in a file, and any markings in the drawings can be in slightly different positions or be of different size.

A text file comprises notably less redundancy. Likewise, the information rate is much lower. The information content of a typical type-written page is 2 kilobytes, and it takes several minutes to read it, whereas the information rate of CD-quality music is 150 kilobytes per second, and that of a TV signal is several megabytes per second. Redundancy, however, can be added to a text file by utilizing the fact that many words have synonyms, that the word order can sometimes be changed, etc. Small and capital letters can be varied, e.g. GSM, Gsm or gsm. Some words also have alternative spellings, such as ‘disc’ or ‘disk’. Small letter l can sometimes be replaced by number 1, and capital O by 0, or vice versa. The order of blocks in computer programs can vary in a file. Even the instructions of the computer programs have several alternative forms: for example, the addition of 1 or subtraction of −1 give the same result.

Alternative bit strings can thus be added to a text file or to a source file of a computer program, for example as follows: {alternative 1/alternative 2/ . . . /alternative N}.

The modification block 16 receives, as inputs, a source file 10 and a user-specific sequence 15. On the basis of the source file 10 and the sequence 15, the block creates a different target file 17 for each user. At each alternative bit string, the modification block 16 converts an element 15 a-15 e of the sequence 15 to a number range corresponding to the number N of the alternatives. The processing is described in greater detail in FIG. 3. The modification block 16 reads a sentence 30 from the source file 10. The sentence, which serves as an example, contains two lists of alternative expressions 31 and 32, the former being here examined in greater detail. Reference 31 a indicates where alternative list 31 begins. In this example, lists 31 and 32 begin with the sign {. Reference 31 b indicates the first number of list 31, which shows how many of the alternatives of list 31 are used. If the alternative list 31 or 32 does not indicate the number of bit strings to be used, then one of the alternative bit strings is used. Here the list contains three alternatives 31 c-31 e, all of which are used. Reference 31 f indicates a delimiter (here the sign /) between alternative bit strings, and reference 31 g indicates the end of the list (here the sign }).
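The list syntax described above could be read by a small parser along the following lines. This is only a sketch: the exact layout of sentence 30 is shown in FIG. 3 and not in the text, so the example sentence, the name `parse_alternative_lists`, and the assumption that the leading count is separated from the alternatives by the same delimiter / are all illustrative.

```python
import re

# One alternative list: start sign {, content without nested braces, end sign }.
LIST_PATTERN = re.compile(r"\{([^{}]*)\}")


def parse_alternative_lists(sentence):
    """Return a (count, alternatives) pair for each {...} list found."""
    lists = []
    for match in LIST_PATTERN.finditer(sentence):
        fields = [f.strip() for f in match.group(1).split("/")]
        if fields[0].isdigit():
            # A leading number states how many alternatives are used.
            count, alternatives = int(fields[0]), fields[1:]
        else:
            # No number given: one of the alternatives is used.
            count, alternatives = 1, fields
        lists.append((count, alternatives))
    return lists


example = "You are in a {3/little/maze of/twisty} {corridors/passages}, all different."
print(parse_alternative_lists(example))
```

A first alternative that happens to be purely numeric would be mistaken for the count in this sketch; a real syntax would need to rule that out.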

At each alternative list 31 and 32, the modification block 16 reads the random numbers 15 a-15 e, etc. and converts them to a number range corresponding to the number of alternatives. In the example of FIG. 2, the random numbers 15 a-15 e vary within the range 0 to 9999, and the first list of alternatives 31 comprises three alternatives. The range of the random numbers can simply be divided into three parts, the limits being 3333 and 6666. The first random number 15 a (5765) is within the range 3333 to 6666, so the elements of the first list of alternatives 31 can be used in the order second, third, first. In the second alternative list 32 there are two alternatives, of which one is used. The value of the second random number 15 b is 2352, which is less than half of the range (there are two alternatives). From the second list of alternatives 32, the first element is therefore used. The alternative bit strings need not all be equally probable. In particular, when an alternative bit string is based on a misspelling made on purpose, the correct form can be selected with a high probability, e.g. 90 to 99%. In alternative lists 31 and 32, for example, it is possible to use an indication (e.g. a different delimiter 31 f) showing that some of the alternatives are misspellings that are highly unlikely to be selected.
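The conversion of a random number to the range of alternatives can be sketched as follows. The names `part_index` and `select` are hypothetical, and treating the resulting order as a rotation of the list is an assumption: the text only states that the second part of the range gives the order second, third, first.

```python
def part_index(key, n_alternatives, key_range=10000):
    """Map a key in 0..key_range-1 to one of n_alternatives equal parts."""
    return key * n_alternatives // key_range


def select(alternatives, count, key):
    """Rotate the list by the part index and use the first `count` elements."""
    shift = part_index(key, len(alternatives))
    rotated = alternatives[shift:] + alternatives[:shift]
    return rotated[:count]


# List 31 with key 15 a = 5765: second part, so order second, third, first.
print(select(["little", "maze of", "twisty"], 3, 5765))
# ['maze of', 'twisty', 'little']

# List 32 with key 15 b = 2352: first half, so the first element is used.
print(select(["corridors", "passages"], 1, 2352))
# ['corridors']
```

Unequal probabilities, such as a rarely chosen deliberate misspelling, would need parts of unequal width in place of the equal division used here.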

The number of possible permutations can be increased e.g. in such a way that when the random number 15 a, 15 b, etc. meets a certain additional condition, the list of alternatives is read from right to left. The additional condition can be, for example, that the random number is an even number or that it is closer to the upper limit of the part than to the lower limit. With only three alternatives, six permutations are thus obtained. In FIG. 3 reference 33 indicates that even one simple sentence generates 12 different permutations.
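The count of six permutations for three alternatives, and hence 12 for the example sentence, can be checked with a short enumeration. The sketch assumes that the three parts of the key range correspond to the three rotations of the list and that the additional condition reverses the reading direction; the name `orderings` is hypothetical.

```python
def orderings(alternatives):
    """Enumerate every order reachable by a rotation read in either direction."""
    n = len(alternatives)
    result = set()
    for shift in range(n):
        rotated = alternatives[shift:] + alternatives[:shift]
        result.add(tuple(rotated))            # read from left to right
        result.add(tuple(reversed(rotated)))  # read from right to left
    return result


first_list = orderings(["little", "maze of", "twisty"])
second_list = ["corridors", "passages"]  # one of the two alternatives is used
print(len(first_list), len(first_list) * len(second_list))  # 6 12
```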

FIG. 4 shows an embodiment in which the lists of alternative bit strings are not combined with the source file 10 as they are in the embodiment of FIG. 3. In FIG. 4 source file 10 a comprises only substantial information, exemplified by sentence 40. The list of alternative bit strings is stored in its own file 10 b, which contains records 10 b 1, 10 b 2, 10 b 3, etc. In this embodiment, the modification block 16 processes source file 10 a. At sentence 40, which serves as an example, the modification block 16 recognizes bit strings 40 a, 40 b and 40 c, which are also stored in alternative bit string file 10 b. As the recognition takes place, the modification block 16 modifies the information read from source file 10 a, and creates a target file 17 in almost the same way as described in connection with FIG. 3. Even here 12 permutations can be generated from a simple sentence 40 by three alternative bit strings 10 b 1-10 b 3.

The embodiments of FIGS. 3 and 4 can also be used together, as shown in FIG. 5. In this embodiment, block 51 automatically identifies those bit strings (here 40 a-40 c) of the sentences (which are exemplified by sentence 40) of source file 10 a for which an alternative bit string is defined in records 10 b 1-10 b 3 of file 10 b. Block 51 automatically creates a source file 10 whose content is of the same form as sentence 30 in FIG. 3. Such an automatically generated, combined source file can be supplemented manually before it is supplied to the modification block 16 according to the invention.

In FIGS. 3 and 4 the sentences 30 and 40 to be modified are simple sentences written in English. The sole reason for this is to make the operation of the invention more readily understandable. It is thus not essential to the invention what kind of information the files 10, 10 a and 10 b contain. The modification block 16 operates by simple mechanical rules of inference, and it does not understand the content of the source file 10 or target file 17. With respect to the modification block 16, the lists of alternatives 31, 32, etc. can be any bit strings whatsoever. In fact, the different permutations of sentence 30 in FIG. 3 are not exact synonyms of one another; their information content, however, differs so little that the difference is not significant.

An alternative for storing possible modifications beforehand either in the source file 10 or in a separate modification file 10 b is that the modifications are made on the basis of a suitable algorithm upon creating each separate target file 17, whereby the modifications can be stored in the modification block, or modification program 16. For example, let us assume that the supplier wants to protect a product catalogue intended for retailers, the catalogue possibly containing confidential information. The product catalogue contains product numbers, and to the corresponding bit strings extra redundancy can be added by interleaving in them bits formed from modification keys. With a 10-bit code, 1024 retailers can be separated from one another. For example, let the product number be 98765, the 20-bit binary representation of which is 00011000000111001101. For example, one bit derived from the modification key can thus be interleaved between two bits of the product number. Each retailer would then receive a product list with individual product numbers. When a retailer makes an order, the extra bits are deleted automatically. If such a list of products is found in the possession of an unauthorized user, the origin of the confidential information can be concluded from the bits added.
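The interleaving of key bits with product number bits might look as follows in Python. This is a sketch under one possible reading of the passage: one code bit is inserted after every two product bits, so that a 20-bit product number carries a 10-bit retailer code. The function names `interleave` and `deinterleave` and the retailer code 683 are hypothetical.

```python
def interleave(product_number, retailer_code, product_bits=20, code_bits=10):
    """Insert one code bit after each group of product bits."""
    if product_bits % code_bits != 0:
        raise ValueError("product_bits must be a multiple of code_bits")
    p = format(product_number, "0{}b".format(product_bits))
    c = format(retailer_code, "0{}b".format(code_bits))
    step = product_bits // code_bits  # product bits per inserted code bit
    return "".join(p[i * step:(i + 1) * step] + c[i] for i in range(code_bits))


def deinterleave(marked, product_bits=20, code_bits=10):
    """Split a marked bit string back into (product_number, retailer_code)."""
    step = product_bits // code_bits
    group = step + 1
    p = "".join(marked[i * group:i * group + step] for i in range(code_bits))
    c = "".join(marked[i * group + step] for i in range(code_bits))
    return int(p, 2), int(c, 2)


print(format(98765, "020b"))                # 00011000000111001101
print(deinterleave(interleave(98765, 683)))  # (98765, 683)
```

When an order is received, only the first element returned by `deinterleave` is needed; the second element identifies the retailer if a list is found where it should not be.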

In the above embodiments the nature of the modification depends on the modification keys 15 a-15 e generated on the basis of the user identifier 12 a. Alternatively, the modification keys can define the position of the modification rather than the nature of the modification. For example, a repetitive modification can be generated in a video signal, the distance between the modifications being defined on the basis of the modification keys 15 a-15 e. This kind of modification means, for example, that the mutual order of two video frames is changed, or that the bits of a video frame are modified so that the checksum calculated from the bits is a predefined number, such as zero.
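Position-defining keys can be illustrated with the frame-swapping example. In this minimal sketch the frames are represented by plain list elements, each key is taken directly as the distance in frames to the next modification, and the name `mark_frames` is hypothetical; keys in the range 0 to 9999 would in practice first be scaled to a suitable distance.

```python
def mark_frames(frames, keys):
    """Swap adjacent frames at positions spaced according to the keys."""
    marked = list(frames)
    pos = 0
    for key in keys:
        pos += key  # distance to the next modification
        if pos + 1 >= len(marked):
            break   # no complete pair of frames left
        marked[pos], marked[pos + 1] = marked[pos + 1], marked[pos]
        pos += 2    # continue after the swapped pair
    return marked


print(mark_frames(list(range(12)), [3, 4]))
# [0, 1, 2, 4, 3, 5, 6, 7, 8, 10, 9, 11]
```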

The structure of the distribution channel 18 is usually dependent on the user identification block 12. If the user orders a product by mail, the distribution channel 18 refers, for example, to the delivery of a diskette by mail. Particular attention should be paid to distribution of electronic documents in an electronic network, such as the Internet. The syntax of the lists of alternatives 31 and 32 described in FIG. 3 can be implemented in an Internet server, for example as described below.

Let us now study FIG. 6. In the prior art, information suppliers distribute information via servers, which are commonly called Web servers and are represented by a computer indicated by reference 63 in FIG. 6. The PC user communicates with the Internet via a browser. The connection can be established, for example, via a modem 60 to the network operator's communications server 61. Let us assume that the PC user wants to establish a connection with a Web server 63 having an identifier of the form ‘http://www.xxx.yy’, where xxx is a firm identifier, and yy is a specifier, such as a country code. The network forwards the request to a Domain Name Server (DNS) 62, which tells the TCP/IP address of server 63, on the basis of which the network establishes a connection from the user's PC to server 63. The Web server 63 sends information to the user's PC, the information usually being a HyperText Markup Language (HTML) document. The service provider (copyright proprietor) can publish information in the Internet, for example, such that the Web server 63 is connected through a local area network 65 to other computers 64 of the same firm, which produce the information. Alternatively, the server 63 can be the network operator's computer, whereby the service provider can maintain the information contained in the server 63 either by sending diskettes to the operator or via a connection 60-61 like the one on which the PC user communicates with the Internet. The documents transmitted through the Internet can contain simple text or entire multimedia programs.

When the user has found interesting information, he can give the browser the command ‘print’ or ‘save’, whereby the printer prints the information or the information is stored in the memory. The storing is problematic for the information provider, for the browsers usually store the information with the HTML commands. The lists of alternative bit strings described in connection with FIGS. 3 to 5 must not be forwarded to the user under any circumstances whatsoever. In the invention, the operation of the Web server is expanded as follows. The HTML language is expanded by an additional command that can be e.g. the verb ‘pick’. Sentence 30 of FIG. 3, expanded with the HTML in accordance with the present invention, would thus read as follows:

You are in a <pick 3, ‘little’, ‘maze of’, ‘twisty’>

<pick 1, ‘corridors’, ‘passages’>, all different.

As regards the verb ‘pick’, the Web server expanded in accordance with the invention generates a different sentence for each user, as described in connection with FIG. 3. The Web server forwards only the final sentence to the user via the Internet, the final sentence in the example of FIG. 3 being one of sentences 33, from which any occurrences of the verb ‘pick’ and any unused alternatives have been deleted.
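The server-side expansion of the ‘pick’ command could be sketched as follows. The function name `expand_picks`, the regular expressions, the rotation rule, and the joining of the chosen alternatives with single spaces are assumptions made for illustration; the application does not specify how the server parses the command.

```python
import re

# <pick N, 'a', 'b', ...> with either straight or typographic quotes.
PICK = re.compile(r"<pick\s+(\d+)\s*,(.*?)>")
QUOTED = re.compile(r"[‘'](.*?)[’']")


def expand_picks(template, keys, key_range=10000):
    """Replace every pick command by alternatives chosen from the keys."""
    key_iter = iter(keys)

    def replace(match):
        count = int(match.group(1))
        alternatives = QUOTED.findall(match.group(2))
        # Map the next key to one part of the range, then rotate the list.
        shift = next(key_iter) * len(alternatives) // key_range
        rotated = alternatives[shift:] + alternatives[:shift]
        return " ".join(rotated[:count])

    return PICK.sub(replace, template)


template = ("You are in a <pick 3, ‘little’, ‘maze of’, ‘twisty’> "
            "<pick 1, ‘corridors’, ‘passages’>, all different.")
print(expand_picks(template, [5765, 2352]))
# You are in a maze of twisty little corridors, all different.
```

Only the expanded sentence would be sent to the browser; the template with its alternatives stays on the server. Alternatives containing an apostrophe would not be parsed correctly by this simple pattern.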

The Internet also comprises proxy servers, which are not shown separately in FIG. 6. The proxy servers reduce the load on international connections by storing the last-read pages of information in the memory. If the same page is read several times in succession, the page is read once from the information provider's server, but after that from the proxy server. There is a risk that if there are two users A and B and the latter copies information without authorization, the information content may refer to user A. In fact, the information copied without authorization does not include anything that would refer to the dishonest user B. However, user B can probably be exposed on the basis of the log file maintained by the operator. It is possible to conclude from the information copied without authorization who the information has been delivered to. Proxy servers usually have a time limit, e.g. 24 hours, and no information that is older than that is kept in the memory. It is then possible to conclude from the log who have requested information substantially simultaneously, so the number of suspected users is limited to a very small group of users.

It is a time-consuming process to prove an alleged copyright infringement in court. Technology can be used for preventing new infringements such that the Web server is provided with a list of blocked users that prevents (e.g. on the basis of the electronic address) creation of a service to a user that is suspected of having published information without authorization.

The copyright can be later proved more easily if it is possible to show, without any doubt, that the source file has been in existence on a certain day. This can be proved, for example, by calculating from the source file a multibyte cyclic checksum using a known algorithm, and publishing the checksum in a means of communication.
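As one example of a known cyclic checksum, the four-byte CRC-32 of the source file could be calculated as follows. The application does not name an algorithm, so the choice of CRC-32 (via the `zlib` module of the Python standard library) and the function name `cyclic_checksum` are assumptions.

```python
import zlib


def cyclic_checksum(data):
    """Return the CRC-32 of the given bytes as eight hexadecimal digits."""
    return format(zlib.crc32(data) & 0xFFFFFFFF, "08x")


# In practice `data` would be the complete content of the source file,
# read in binary mode.
print(cyclic_checksum(b"123456789"))  # cbf43926
```

A 32-bit checksum can be reproduced by a different file with modest effort, so a longer checksum may be preferable where the proof is likely to be contested.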

When the material provider later detects a document that he suspects to have been copied illegally, he can find out who the information in question has been delivered to, e.g. by testing which user identifier gives a document that is identical to the one that is suspected to be a pirate copy. Since it is also possible that the user has modified the document he has ‘borrowed’, it is possible to define, for example, a certain coefficient of correlation or some other corresponding threshold value. To expose an unauthorized copy, it is not necessary that a whole document has been copied: copying of even part of a document is sufficient. As shown above, even one simple sentence can yield 12 permutations. Two such sentences yield 144 permutations, and six sentences yield about three million. Since the copyright allows reasonable borrowing, it is not worthwhile to even try and expose small-scale borrowing.
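The comparison of a suspected copy against the documents generated for each user identifier could be sketched as follows. The similarity ratio of `difflib.SequenceMatcher` from the Python standard library is used here in place of a coefficient of correlation, which is a substitution made for this sketch; the function name `most_likely_user`, the threshold of 0.8 and the example texts are likewise hypothetical.

```python
from difflib import SequenceMatcher


def most_likely_user(suspect, targets, threshold=0.8):
    """Return (user, score) for the best match, or (None, score) if too weak.

    `targets` maps each user identifier to the target document that is
    regenerated for that identifier.
    """
    best_user, best_score = None, 0.0
    for user, text in targets.items():
        score = SequenceMatcher(None, suspect, text).ratio()
        if score > best_score:
            best_user, best_score = user, score
    if best_score < threshold:
        return None, best_score
    return best_user, best_score


targets = {
    "A": "You are in a maze of twisty little corridors, all different.",
    "B": "You are in a twisty little maze of passages, all different.",
}
# A copy of user A's version whose ending has been altered.
user, score = most_likely_user(
    "You are in a maze of twisty little corridors, all alike.", targets)
print(user)  # A
```

The number of distinguishable versions grows quickly with the number of marked sentences: six sentences with 12 permutations each give 12**6 = 2985984 combinations.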

Since modifications have to be made in several locations of the source file, it is evident that the only reasonable way to make the modifications is to use a computer or some other digital processor. The number of modifications needed must be estimated specifically for each case. As a rule, there must be so many modifications that even large-scale modification of a copied document can be exposed or at least made uneconomic. In the long run, the information provider will have a large library of alternative expressions, which can be utilized in various documents. If all the alternative expressions and other modifications are made by the same author as the actual text to be protected, the author owns the copyright on all the computer-made modifications. The computer and software do thus not add any contributions of their own that could interfere with the idea of the copyright.

It is naturally advantageous to use all the above techniques in different combinations. In multimedia documents, which comprise voice and/or video information as well as text, the different parts can be protected in different ways. Even in other respects it will be obvious to the person skilled in the art that as technology advances, the basic idea of the invention can be implemented in many different ways. The invention and its embodiments are thus not limited to the above examples but can vary within the scope of the claims.

What is claimed is:
 1. A method of marking information to be produced in order to expose unauthorized publication of said information, the method comprising the steps of: defining a user identifier, creating a source file containing said information to be marked, and creating a different target file for each user on the basis of the source file and the user identifier, storing, in a file, predefined modification rules for modifying the information contained in the source file, and, by means of a digital processor, generating a sequence of modification keys on the basis of the user identifier, and defining the locations in the source file to be modified and making modifications in these locations, determining at least one of the nature and position of the modifications on the basis of the predefined modification rules and modification keys; wherein the source file comprises a first part containing actual information, and a second part containing formatting instructions, and that modifications are made in said first part.
 2. A method according to claim 1, wherein a number of predefined modifications are stored in the source file.
 3. A method according to claim 1, wherein a number of predefined modifications are stored in a separate modification file.
 4. A method according to claim 1, wherein a number of predefined modification rules are stored in a modification program.
 5. A method according to claim 1, wherein the modifications are stored in the source file as bit strings comprising at least a start character, a stop character, and a number of alternative bit strings, from which at least one is written into the source file at the modification concerned, and that when the source file is modified to form a target file, the positions of the modifications are determined on the basis of the positions of the bit strings in the information contained in the source file.
 6. A method according to claim 1, wherein the actual information is human-readable plaintext.
 7. A method of marking information to be produced in order to expose unauthorized publication of said information, the method comprising the steps of: defining a user identifier, creating a source file containing said information to be marked, and creating a different target file for each user on the basis of the source file and the user identifier, storing, in a file, predefined modification rules for modifying the information contained in the source file, and, by means of a digital processor, generating a sequence of modification keys on the basis of the user identifier, and defining the locations in the source file to be modified and making modifications in these locations, determining at least one of the nature and position of the modifications on the basis of the predefined modification rules and modification keys; wherein the modifications are stored in the source file as bit strings comprising at least a start character, a stop character, and a number of alternative bit strings from which at least one is written into the source file at the modification concerned, and that when the source file is modified to form a target file, the positions of the modifications are determined on the basis of the positions of the bit strings in the information contained in the source file; and wherein the modifications are stored as bit strings, which also define a number indicating how many alternative bit strings are written into the target file at the modification concerned.
 8. A method of marking information to be produced in order to expose unauthorized publication of said information, the method comprising the steps of: defining a user identifier, creating a source file containing said information to be marked, and creating a different target file for each user on the basis of the source file and the user identifier, storing, in a file, predefined modification rules for modifying the information contained in the source file, and, by means of a digital processor, generating a sequence of modification keys on the basis of the user identifier, and defining the locations in the source file to be modified and making modifications in these locations, determining at least one of the nature and position of the modifications on the basis of the predefined modification rules and modification keys; wherein a number of predefined modifications are stored in a separate modification file; further comprising storing the modification rules in a separate modification file as lists of alternative bit strings, and, when the source file is modified to create a target file; recognizing, from the information contained in the source file, bit strings for which a list of alternative bit strings has been defined, and in response to the recognition, replacing the bit string of the source file with at least one alternative bit string obtained from the list of bit strings that are alternative to the bit string of the source file.
 9. A method of marking information to be produced in order to expose unauthorized publication of said information, the method comprising the steps of: defining a user identifier, creating a source file containing said information to be marked, and creating a different target file for each user on the basis of the source file and the user identifier, storing, in a file, predefined modification rules for modifying the information contained in the source file, and, by means of a digital processor, generating a sequence of modification keys on the basis of the user identifier, and defining the locations in the source file to be modified and making modifications in these locations, determining at least one of the nature and position of the modifications on the basis of the predefined modification rules and modification keys; wherein a number of predefined modification rules are stored in a modification program; and wherein, in a direction in which the information passes from an information provider to the user, bits derived from the modification keys are interleaved with bits of information contained in the source file, and in the reverse direction the bits added by interleaving are ignored.
 10. A system for marking information to be produced in order to expose unauthorized publication of information, the system comprising means for defining a user identifier, a source file containing the information to be marked, and processing means for creating a different target file for each user on the basis of the source file and the user identifier, a file storing predefined modification rules in order to modify the information contained in the source file, processing means for generating a sequence of modification keys on the basis of the user identifier, and that the processing means are arranged to define the locations of the source file to be modified, and to make modifications in these locations, wherein at least one of the nature and position of the modification is determined on the basis of the modification rules and modification keys, wherein the source file comprises a first part containing actual information, and a second part containing formatting instructions, and that modifications are made in said first part.
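
By way of illustration only, and forming no part of the claims, the marking scheme in which modifications are stored in the source file between a start character and a stop character might be sketched as follows. The marker symbols `{`, `}` and `|`, the function names `modification_keys` and `make_target`, and the use of SHA-256 to derive the key sequence are all assumptions made for this sketch; the claims require only that a sequence of modification keys be generated on the basis of the user identifier.

```python
import hashlib
import re

# Assumed marker syntax: "{" is the start character, "}" the stop character,
# and "|" separates the alternative strings stored at one modification.
MARKER = re.compile(r"\{([^{}]*)\}")


def modification_keys(user_id: str):
    """Yield an unbounded sequence of integers (0-255) derived from user_id.

    SHA-256 over "user_id:counter" is an assumed key generator, chosen only
    because it is deterministic for a given user identifier.
    """
    counter = 0
    while True:
        digest = hashlib.sha256(f"{user_id}:{counter}".encode("utf-8")).digest()
        yield from digest  # iterating over bytes yields integers
        counter += 1


def make_target(source: str, user_id: str) -> str:
    """Create a user-specific target text from a marked-up source text."""
    keys = modification_keys(user_id)

    def choose(match: re.Match) -> str:
        # The position of each modification is given by the position of the
        # marker in the source; the key selects which alternative is written.
        alternatives = match.group(1).split("|")
        return alternatives[next(keys) % len(alternatives)]

    return MARKER.sub(choose, source)


source = "The {report|document} was {sent|delivered} on {Monday|the first day of the week}."
target = make_target(source, "user-0001")
print(target)
```

Because the key sequence depends only on the user identifier, the same user always receives the same target text. Comparing a leaked copy against the alternatives stored at each marker would indicate which keys were applied and hence which identifier is consistent with the copy.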